

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%


\documentclass[12pt]{article}


\usepackage{amsmath, amsthm, amssymb, setspace, fullpage, apacite, enumitem, listings}
\usepackage[margin=0.7in]{geometry}
\usepackage[english]{babel}

\renewcommand{\qedsymbol}{$\scriptstyle \blacksquare$}
\renewcommand{\vec}[1]{\mbox{\boldmath$#1$}}
\newcommand{\dto}{\overset{d}{\to}}
\newcommand{\pto}{\overset{p}{\to}}
\newcommand{\E}{\mathrm{E}}
\newcommand{\F}{\mathfrak{F}}
\newcommand{\I}{\mathrm{I}}
\newcommand{\M}{\mathfrak{M}}
\newcommand{\N}{\mathrm{N}}
\newcommand{\diag}{\mathrm{diag}}
\renewcommand{\P}{\mathrm{P}}
\newcommand{\Q}{\mathrm{Q}}
\newcommand{\Cov}{\mathrm{Cov}}
\newcommand{\Var}{\mathrm{Var}}
\newcommand{\betav}{\vec{\beta}}
\newcommand{\betahat}{\hat{\vec{\beta}}}
\newcommand{\argmax}{\operatornamewithlimits{argmax}}
\newcommand{\argmin}{\operatornamewithlimits{argmin}}
\newcommand{\plim}{\operatornamewithlimits{plim}}
\newcommand{\interior}{\operatornamewithlimits{int}}

\newcommand\independent{\protect\mathpalette{\protect\independenT}{\perp}}
\def\independenT#1#2{\mathrel{\setbox0\hbox{$#1#2$}%
\copy0\kern-\wd0\mkern4mu\box0}}  % statistical independence symbol

\newtheorem{thm}{Theorem}[section]
\newtheorem{corollary}[thm]{Corollary}
\newtheorem{lemma}[thm]{Lemma}
\newtheorem{axiom}[thm]{Axiom}

\theoremstyle{definition}
\newtheorem{defn}[thm]{Definition}

\theoremstyle{definition}
\newtheorem{example}[thm]{Example}

\theoremstyle{definition}
\newtheorem{assumption}[thm]{Assumption}

\theoremstyle{remark}
\newtheorem{remark}[thm]{Remark}


\onehalfspace
%\setlength{\parskip}{1ex}


%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\begin{document}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\title{Causal Inference in Python: A Vignette}
\author{Laurence Wong}
\maketitle

This document illustrates the use of \textit{Causalinference} with a simple simulated data set. We begin with some basic definitions.

\section{Setting and Notation}

As is standard in the literature, we work within the framework of Rubin's potential outcome model \cite{Rubin.1974}.

Let $Y(0)$ denote the potential outcome of a subject in the absence of treatment, and let $Y(1)$ denote the unit's potential outcome when it is treated. Let $D$ denote treatment status, with $D=1$ indicating treatment and $D=0$ indicating control, and let $X$ be a $K$-column vector of covariates or individual characteristics.

For unit $i$, $i=1,2,\ldots,N$, the observed outcome can be written as
\[Y_i = (1-D_i) Y_i(0) + D_i Y_i(1).\]
The set of observables $(Y_i, D_i, X_i)$, $i=1,2,\ldots,N$, forms the basic input data set for \textit{Causalinference}.

\textit{Causalinference} is appropriate for settings in which treatment can be said to be \textit{strongly ignorable}, as defined in Rosenbaum and Rubin \citeyear{RosenbaumRubin.1983}. That is, for all $x$ in the support of $X$, we have
\begin{itemize}
\item[(i)] Unconfoundedness: $D$ is independent of $\big(Y(0), Y(1)\big)$ conditional on $X=x$;
\item[(ii)] Overlap: $c < \P(D=1|X=x) < 1-c$, for some $c>0$.
\end{itemize}

In the following, we illustrate the typical flow of a causal analysis using the tools of \textit{Causalinference} and a simulated data set. In simulating the data, we specified a constant treatment effect of 10 for simplicity, and incorporated systematic overlap issues and nonlinearities to highlight a number of tools in the package. We focus mostly on illustrating the use of \textit{Causalinference}; for details on methodology please refer to Imbens and Rubin \citeyear{ImbensRubin.2015}.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Initialization} \label{sec.a}

The main object of interest in \textit{Causalinference} is the class \texttt{CausalModel}, which we can import with
\begin{verbatim}
  >>> from causalinference import CausalModel
\end{verbatim}
\texttt{CausalModel} takes as inputs three NumPy arrays: \texttt{Y}, an $N$-vector of observed outcomes; \texttt{D}, an $N$-vector of treatment status indicators; and \texttt{X}, an $N$-by-$K$ matrix of covariates. To initialize a \texttt{CausalModel} instance, simply run:
\begin{verbatim}
  >>> causal = CausalModel(Y, D, X)
\end{verbatim}

Once an instance of the class \texttt{CausalModel} has been created, it will contain a number of attributes and methods that are relevant for conducting causal analyses. Tables \ref{tab.a} and \ref{tab.b} contain a brief description of these attributes and methods.

\texttt{CausalModel} is \textit{stateful}. As we employ some of the methods to be discussed subsequently, the instance \texttt{causal} will mutate, with new data being added or existing data being modified or dropped. Running
\begin{verbatim}
  >>> causal.reset()
\end{verbatim}
will return \texttt{causal} to its initial state.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Summary Statistics} \label{sec.b}

Once \texttt{CausalModel} has been instantiated, basic summary statistics will be computed and stored in the attribute \texttt{summary\_stats}. We can display it by running:
\begin{verbatim}
  >>> print(causal.summary_stats)
\end{verbatim}
\begin{verbatim}
  Summary Statistics
  
                         Controls (N_c=392)         Treated (N_t=608)             
         Variable         Mean         S.d.         Mean         S.d.     Raw-diff
  --------------------------------------------------------------------------------
                Y       43.097       31.353       90.911       41.815       47.814
  
                         Controls (N_c=392)         Treated (N_t=608)             
         Variable         Mean         S.d.         Mean         S.d.     Nor-diff
  --------------------------------------------------------------------------------
               X0        3.810        2.950        5.762        2.566        0.706
               X1        3.436        2.848        5.849        2.634        0.880
\end{verbatim}

The attribute \texttt{summary\_stats} is in reality just a dictionary-like object with special method defined to enable the display of the above table. In many situations it is more convenient to simply access the relevant statistic directly. To retrieve the vector of covariate means for the treatment group, for example, we simply run:
\begin{verbatim}
  >>> causal.summary_stats['X_t_mean']
  array([ 5.76232357,  5.8489734 ])
\end{verbatim}

Since \texttt{summary\_stats} behaves like a dictionary, it is equipped with the usual Python dictionary methods. To list the dictionary keys, for instance, we go:
\begin{verbatim}
  >>> causal.summary_stats.keys()
  ['Y_c_mean', 'X_t_sd', 'N_t', 'K', 'ndiff', 'N', 'Y_t_sd', 'rdiff', 'Y_t_mean',
  'X_c_mean', 'X_t_mean', 'Y_c_sd', 'X_c_sd', 'N_c']
\end{verbatim}

Here \texttt{rdiff} refers to the difference in average observed outcomes between treatment and control groups. \texttt{ndiff}, on the other hand, refers to the normalized differences in average covariates, defined as
\[\frac{\bar{X}_{k,t} - \bar{X}_{k,c}}{\sqrt{\left(s^2_{k,t}+s^2_{k,c}\right)\Big/ 2}},\]
where $\bar{X}_{k,t}$ and $s_{k,t}$ are the sample mean and sample standard deviation of the $k$th covariate of the treatment group, and $\bar{X}_{k,c}$ and $s_{k,c}$ are the analogous statistics for the control group.

The normalized differences in average covariates provide a way to measure the covariate balance between the treatment and the control groups. Unlike the t-statistic, its absolute magnitude does not increase (in expectation) as the sample size increases.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Least Squares Estimation}

One of the simplest treatment effect estimators is the ordinary least squares (OLS) estimator. \textit{Causalinference} provides several common regression specifications.

By default, the method \texttt{est\_via\_ols} will run the following regression:
\[Y_i = \alpha + \beta D_i + \gamma' (X_i-\bar{X}) + \delta' D_i (X_i-\bar{X}) + \varepsilon_i.\]

To inspect any treatment effect estimates produced, we can simply invoke \texttt{print} on the attribute \texttt{estimates}, as in below:
\begin{verbatim}
  >>> causal.est_via_ols()
  >>> print(causal.estimates)
\end{verbatim}
\begin{verbatim}
  Treatment Effect Estimates: OLS
  
                       Est.       S.e.          z      P>|z|      [95% Conf. int.]
  --------------------------------------------------------------------------------
             ATE      3.672      0.906      4.051      0.000      1.895      5.449
             ATC     -0.227      0.930     -0.244      0.807     -2.050      1.596
             ATT      6.186      1.067      5.799      0.000      4.095      8.277
\end{verbatim}
Here ATE, ATC, and ATT stand for, respectively, average treatment effect, average treatment effect for the controls, and average treatment effect for the treated. Like \texttt{summary\_stats}, the attribute \texttt{estimates} is a dictionary-like object that contains the estimation results.

Including interaction terms between the treatment indicator $D$ and covariates $X$ implies that treatment effects can differ across individuals. In some instances we may want to assume a constant treatment effect, and only run
\[Y_i = \alpha + \beta D_i + \gamma' (X_i-\bar{X}_i) + \varepsilon_i.\]
This can be achieved by supplying a value of 1 in \texttt{est\_via\_ols} to the optional parameter \texttt{adj} (its default value is 2). To compute the raw difference in average outcomes between treatment and control groups, we can set \texttt{adj=0}.

In this example, the least squares estimates are radically different from the true treatment effect of 10. This is the result of the nonlinearity and non-overlap issues intentionally introduced into the data simulation process. As we shall see, several other tools exist in \textit{Causalinference} that can better deal with a lack of overlap and that will allow us to obtain estimates that are less sensitive to functional form assumptions.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Propensity Score Estimation} \label{sec.c}

The probability of getting treatment conditional on the covariates, $p(X_i) = \P(D_i=1|X_i)$, also known as the propensity score, plays a central role in much of what follows. Two methods, \texttt{est\_propensity} and \texttt{est\_propensity\_s}, are provided for propensity score estimation. Both involve running a logistic regression of the treatment indicator $D$ on functions of the covariates. \texttt{est\_propensity} allows the user to specify the covariates to include linearly and/or quadratically, while \texttt{est\_propensity\_s} will make this choice automatically based on a sequence of likelihood ratio tests.

In the following, we run \texttt{est\_propensity\_s} and display the estimation results. In this example, the specification selection algorithm decided to include both covariates and all the interaction and quadratic terms.

\begin{verbatim}
  >>> causal.est_propensity_s()
  >>> print(causal.propensity)
\end{verbatim}
\begin{verbatim}
  Estimated Parameters of Propensity Score
  
                      Coef.       S.e.          z      P>|z|      [95% Conf. int.]
  --------------------------------------------------------------------------------
       Intercept     -2.839      0.526     -5.401      0.000     -3.870     -1.809
              X1      0.486      0.153      3.178      0.001      0.186      0.786
              X0      0.466      0.155      3.011      0.003      0.163      0.770
           X1*X0      0.080      0.015      5.391      0.000      0.051      0.109
           X0*X0     -0.045      0.012     -3.579      0.000     -0.069     -0.020
           X1*X1     -0.045      0.013     -3.542      0.000     -0.070     -0.020
\end{verbatim}

The \texttt{propensity} attribute is again another dictionary-like container of results. The dictionary keys of \texttt{propensity} can be found by running:
\begin{verbatim}
  >>> causal.propensity.keys()
  ['coef', 'lin', 'qua', 'loglike', 'fitted', 'se']
\end{verbatim}
The selected linear and quadratic terms are contained in the lists \texttt{causal.propensity['lin']} and \texttt{causal.propensity['qua']}. Though we won't make direct calls to it, most of the propensity-based techniques discussed subsequently are based on \texttt{causal.propensity['fitted']}, the vector of estimated propensity scores.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Improving Covariate Balance} \label{sec.d}

When there is indication of covariate imbalance, we may wish to construct a sample where the treatment and control groups are more similar than the original full sample. One way of doing so is by dropping units with extreme values of propensity score. For these subjects, their covariate values are such that the probability of being in the treatment (or control) group is so overwhelmingly high that we cannot reliably find comparable units in the opposite group. We may wish to forego estimating treatment effects for such units since nothing much can be credibly said about them.

A good rule-of-thumb is to drop units whose estimated propensity score is less than $\alpha=0.1$ or greater than $1-\alpha=0.9$. By default, once the propensity score has been estimated by running either \texttt{est\_propensity} or \texttt{est\_propensity\_s}, a value of 0.1 will be set for the attribute \texttt{cutoff}:

\begin{verbatim}
  >>> causal.cutoff
  0.1
\end{verbatim}

Calling \texttt{causal.trim()} at this point will drop every unit that has propensity score outside of the $[\alpha, 1-\alpha]$ interval. Alternatively, a procedure exists that will estimate the optimal cutoff that minimizes the asymptotic sampling variance of the trimmed sample. The method \texttt{trim\_s} will perform this calculation, set the \texttt{cutoff} to the optimal $\alpha$, and then invoke \texttt{trim} to construct the subsample. For our example, the optimal $\alpha$ was estimated to be slightly less than 0.1:
\begin{verbatim}
  >>> causal.trim_s()
  >>> causal.cutoff
  0.0954928016329
\end{verbatim}
The complexity of this cutoff selection algorithm is only $O(N \log N)$, so in practice there is very little reason to not employ it.

If we now print \texttt{summary\_stats} again to view the summary statistics of the trimmed sample, we see that the normalized differences in average covariates has fallen noticeably.
\begin{verbatim}
  >>> print(causal.summary_stats)
\end{verbatim}
\begin{verbatim}
  Summary Statistics
  
                         Controls (N_c=371)         Treated (N_t=363)             
         Variable         Mean         S.d.         Mean         S.d.     Raw-diff
  --------------------------------------------------------------------------------
                Y       41.331       29.608       66.067       28.108       24.736
  
                         Controls (N_c=371)         Treated (N_t=363)             
         Variable         Mean         S.d.         Mean         S.d.     Nor-diff
  --------------------------------------------------------------------------------
               X0        3.709        2.872        4.658        2.522        0.351
               X1        3.407        2.784        4.661        2.517        0.472
\end{verbatim}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Stratifying the Sample} \label{sec.e}

With the propensity score estimated, one may wish to stratify the sample into blocks that have units that are more similar in terms of their covariates. This makes the treatment and control groups within each propensity bin more comparable, and therefore treatment effect estimates more credible.

\textit{Causalinference} provides two methods for subclassification based on propensity score. The first, \texttt{stratify}, splits the sample based on what is specified in the attribute \texttt{blocks}. The default value of \texttt{blocks} is set to 5, which means that \texttt{stratify} will split the sample into 5 equal-sized bins. In contrast, the second method, \texttt{stratify\_s}, will use a data-driven procedure for selecting both the number of blocks and their boundaries, with the expectation that the number of blocks should increase with the sample size. Operationally this method is a divide-and-conquer algorithm that recursively divides the sample into two until there is no significant advantage of doing so. This algorithm also runs in $O(N \log N)$ time, so costs relatively little to use. 

To inspect the results of the stratification, we can invoke \texttt{print} on the attribute \texttt{strata} to display some summary statistics, as follows:
\begin{verbatim}
  >>> causal.stratify_s()
  >>> print(causal.strata)
\end{verbatim}
\begin{verbatim}
  Stratification Summary
  
                Propensity Score         Sample Size     Ave. Propensity   Outcome
     Stratum      Min.      Max.  Controls   Treated  Controls   Treated  Raw-diff
  --------------------------------------------------------------------------------
           1     0.095     0.265       157        28     0.188     0.187    11.885
           2     0.266     0.474       111        72     0.360     0.367    12.025
           3     0.477     0.728        70       113     0.598     0.601    11.696
           4     0.728     0.836        23        69     0.781     0.787    10.510
           5     0.838     0.904        10        81     0.865     0.873     3.405
\end{verbatim}

Under the hood, the attribute \texttt{strata} is actually a list-like object that contains, as each of its elements, a full instance of the class \texttt{CausalModel}, with the input data being those that correspond to the units that are in the propensity bin. We can thus, for example, access each stratum and inspect its \texttt{summary\_stats} attribute, or as the following illustrates, loop through \texttt{strata} and estimate within-bin treatment effects using least squares.
\begin{verbatim}
  >>> for stratum in causal.strata:
  ...     stratum.est_via_ols(adj=1)
  ... 
  >>> [stratum.estimates['ols']['ate'] for stratum in causal.strata]
  [10.379170390195197, 9.2918973715823707, 9.67876709257445, 9.6722830043583023,
  9.2239596078238222]
\end{verbatim}

Note that these estimates are much more stable and closer to the true value of 10 than the within-bin raw differences in average outcomes that were reported in the stratification summary table, highlighting the virtue of further controlling for covariates even within blocks.

Taking the sample-weighted average of the above within-bin least squares estimates results in a propensity score matching estimator that is commonly known as the subclassification estimator or blocking estimator. However, instead of manually looping through the \texttt{strata} attribute, estimating within-bin treatment effects, and then averaging appropriately to arrive at an overall estimate, we can also simply call \texttt{est\_via\_blocking}, which will perform these operations and collect the results in the attribute \texttt{estimates}. We will report these estimates in the next section along with estimates obtained from other, alternative estimators.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\section{Treatment Effect Estimation} \label{sec.f}

In addition to least squares and the blocking estimator described in the last section, \textit{Causalinference} provides two alternative treatment effect estimators. The first is the nearest neighborhood matching estimator of Abadie and Imbens \citeyear{AbadieImbens.2006}. Instead of relying on the propensity score, this estimator pairs treatment and control units by matching directly on the covariate vectors themselves. More specifically, each unit $i$ in the sample is matched with a unit $m(i)$ in the opposite group, where
\[m(i) = \argmin_{j: D_j \neq D_i} \|X_j - X_i\|,\]
and $\|X_j - X_i\|$ is some measure of distance between the covariate vectors $X_j$ and $X_i$. The method \texttt{est\_via\_matching} implements this estimator, as well as several extensions that can be invoked through optional arguments.

The last estimator is a version of the Horvitz-Thompson weighting estimator, modified to further adjust for covariates. Mechanically, this involves running the following weight least squares regression:
\[Y_i = \alpha + \beta D_i + \gamma' X_i + \varepsilon_i,\]
where the weight for unit $i$ is $1/\hat{p}(X)$ if $i$ is in the treatment group, and $1/\big(1-\hat{p}(X)\big)$ if $i$ is in the control group. This estimator is also sometimes called the doubly-robust estimator, referring to the fact that this estimator is consistent if either the specification of the propensity score is correct, or the specification of the regression function is correct. We can invoke it by calling \texttt{est\_via\_weighting}. Note that under this specification the treatment effect does not differ across units, so the ATC and the ATT are both equal to the overall ATE.

In the following we invoke each of the four estimators (including least squares, since the input data has changed now that the sample has been trimmed), and print out the resulting estimates.
\begin{verbatim}
  >>> causal.est_via_ols()
  >>> causal.est_via_weighting()
  >>> causal.est_via_blocking()
  >>> causal.est_via_matching(bias_adj=True)
  >>> print(causal.estimates)
\end{verbatim}
\begin{verbatim}
  Treatment Effect Estimates: OLS
  
                       Est.       S.e.          z      P>|z|      [95% Conf. int.]
  --------------------------------------------------------------------------------
             ATE      2.913      0.803      3.627      0.000      1.339      4.487
             ATC      2.435      0.824      2.956      0.003      0.820      4.049
             ATT      3.401      0.885      3.843      0.000      1.667      5.136
\end{verbatim}
\begin{verbatim}
  Treatment Effect Estimates: Weighting
  
                       Est.       S.e.          z      P>|z|      [95% Conf. int.]
  --------------------------------------------------------------------------------
             ATE     17.821      1.684     10.585      0.000     14.521     21.121
\end{verbatim}
\begin{verbatim}
  Treatment Effect Estimates: Blocking
  
                       Est.       S.e.          z      P>|z|      [95% Conf. int.]
  --------------------------------------------------------------------------------
             ATE      9.702      0.381     25.444      0.000      8.954     10.449
             ATC      9.847      0.527     18.701      0.000      8.815     10.879
             ATT      9.553      0.332     28.771      0.000      8.903     10.204
\end{verbatim}
\begin{verbatim}
  Treatment Effect Estimates: Matching
  
                       Est.       S.e.          z      P>|z|      [95% Conf. int.]
  --------------------------------------------------------------------------------
             ATE      9.624      0.245     39.354      0.000      9.145     10.103
             ATC      9.642      0.270     35.776      0.000      9.114     10.170
             ATT      9.606      0.318     30.159      0.000      8.981     10.230
\end{verbatim}

As we can see above, despite the trimming the least squares estimates are still severely biased, as is the weighting estimator (since neither the propensity score or the regression function is correctly specified). The blocking and matching estimators, on the other hand, are less sensitive to specification assumptions, and thus result in estimates that are closer to the true average treatment effects.

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\bibliographystyle{apacite}
\bibliography{references}

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\begin{table}[ht]
\begin{center}\begin{tabular}{ll}
Attribute & Description \\
\texttt{summary\_stats} & Dictionary-like object containing summary statistics for the \\
& outcome variable and the covariates. \\
\texttt{propensity} & Dictionary-like object containing propensity score data, \\
& including estimated logistic regression coefficients, predicted \\
& propensity score, maximized log-likelihood, and the lists of the \\
& linear and quadratic terms that are included in the regression. \\
\texttt{cutoff} & Floating point number specifying the cutoff point for trimming \\
& on propensity score.\\
\texttt{blocks} & Either an integer indicating the number of equal-sized blocks to \\
& stratify the sample into, or a list of ascending numbers specifying \\
& the boundaries of each stratum. \\
\texttt{strata} & List-like object containing the list of stratified propensity bins. \\
\texttt{estimates} & Dictionary-like object containing treatment effect estimates for \\
& each estimator used.
\end{tabular}\end{center}
\caption{Attributes of the class \texttt{CausalModel}. Invoking \texttt{print} on any of the dictionary- or list-like attribute above yields customized summary tables.}  \label{tab.a}
\end{table}

\begin{table}[ht]
\begin{center}\begin{tabular}{ll}
Method & Description \\
\texttt{reset} & Reinitializes data to original inputs, and drops any estimated results. \\
\texttt{est\_propensity} & Estimates via logistic regression the propensity score using specified \\
& linear and quadratic terms. \\
\texttt{est\_propensity\_s} & Estimates via logistic regression the propensity score using the \\
& covariate selection algorithm of Imbens and Rubin \citeyear{ImbensRubin.2015}. \\
\texttt{trim} & Trims data based on propensity score using the threshold specified \\
& by the attribute \texttt{cutoff}. \\
\texttt{trim\_s} & Trims data based on propensity score using the cutoff selected by \\
& the procedure of Crump, Hotz, Imbens, and Mitnik \citeyear{CrumpHotzImbensMitnik.2009}. \\
\texttt{stratify} & Stratifies the sample based on propensity score as specified by \\
& the attribute \texttt{blocks}. \\
\texttt{stratify\_s} & Stratifies the sample based on propensity score using the bin \\
& selection procedure suggested by Imbens and Rubin \citeyear{ImbensRubin.2015}. \\
\texttt{est\_via\_ols} & Estimates average treatment effects using least squares. \\
\texttt{est\_via\_weighting} & Estimates average treatment effects using the Horvitz-Thompson \\
& weighting estimator modified to incorporate covariates. \\
\texttt{est\_via\_blocking} & Estimates average treatment effects using regression within blocks. \\
\texttt{est\_via\_matching} & Estimates average treatment effects using matching with replacement.
\end{tabular}\end{center}
\caption{Methods of the class \texttt{CausalModel}. Invoke \texttt{help} on any of the above methods for more detailed documentation.}  \label{tab.b}
\end{table}

\clearpage

%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

\end{document}

